Guided Stereo Matching
Stereo is a prominent technique to infer dense depth maps from images, and
deep learning further pushed forward the state-of-the-art, making end-to-end
architectures unrivaled when enough data is available for training. However,
deep networks suffer from significant drops in accuracy when dealing with new
environments. Therefore, in this paper, we introduce Guided Stereo Matching, a
novel paradigm that leverages a small amount of sparse yet reliable depth
measurements retrieved from an external source to ameliorate this
weakness. The additional sparse cues required by our method can be obtained
with any strategy (e.g., a LiDAR) and used to enhance features linked to
corresponding disparity hypotheses. Our formulation is general and fully
differentiable, thus allowing the additional sparse inputs to be exploited in
pre-trained deep stereo networks as well as when training a new instance from
scratch. Extensive experiments on three standard datasets and two
state-of-the-art deep architectures show that even with a small set of sparse
input cues, i) the proposed paradigm enables significant improvements to
pre-trained networks. Moreover, ii) training from scratch notably increases
accuracy and robustness to domain shifts. Finally, iii) it is suitable and
effective even with traditional stereo algorithms such as SGM.
Comment: CVPR 201
Real-time self-adaptive deep stereo
Deep convolutional neural networks trained end-to-end are the
state-of-the-art methods to regress dense disparity maps from stereo pairs.
These models, however, suffer from a notable decrease in accuracy when exposed
to scenarios significantly different from the training set (e.g., real vs.
synthetic images). We argue that it is extremely unlikely that enough samples
can be gathered to achieve effective training/tuning in any target domain, thus
making this setup impractical for many applications. Instead, we propose to
perform unsupervised and continuous online adaptation of a deep stereo network,
which allows for preserving its accuracy in any environment. However, this
strategy is extremely computationally demanding and thus prevents real-time
inference. We address this issue by introducing a new lightweight, yet effective,
deep stereo architecture, Modularly ADaptive Network (MADNet), and developing a
Modular ADaptation (MAD) algorithm, which independently trains sub-portions of
the network. By deploying MADNet together with MAD we introduce the first
real-time self-adaptive deep stereo system enabling competitive performance on
heterogeneous datasets.
Comment: Accepted at CVPR2019 as oral presentation. Code available:
https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stere
Learning monocular depth estimation with unsupervised trinocular assumptions
Obtaining accurate depth measurements out of a single image represents a
fascinating solution to 3D sensing. CNNs led to considerable improvements in
this field, and recent trends replaced the need for ground-truth labels with
geometry-guided image reconstruction signals enabling unsupervised training.
Currently, for this purpose, state-of-the-art techniques rely on images
acquired with a binocular stereo rig to predict inverse depth (i.e., disparity)
according to the aforementioned supervision principle. However, these methods
suffer from well-known problems near occlusions, the left image border, etc.,
inherited from the stereo setup. Therefore, in this paper, we tackle these
issues by moving to a trinocular domain for training. Assuming the central
image as the reference, we train a CNN to infer disparity representations
pairing such image with frames on its left and right side. This strategy allows
obtaining depth maps not affected by typical stereo artifacts. Moreover, since
trinocular datasets are seldom available, we introduce a novel interleaved training
procedure that enforces the outlined trinocular assumption using current
binocular datasets. Exhaustive experimental results on the KITTI dataset
confirm that our proposal outperforms state-of-the-art methods for unsupervised
monocular depth estimation trained on binocular stereo pairs as well as any
known methods relying on other cues.
Comment: 14 pages, 7 figures, 4 tables. Accepted to 3DV 201
Urban Geology for the Enhancement of the Hypogean Geosites: the Perugia Underground (Central Italy)
Urban geology analyses natural risks and promotes geoheritage in urban areas. In cities of high cultural value, the hypogean artificial cavities often present in the downtown area offer a unique opportunity to show the geological substratum. Moreover, these places could be a point of interest in urban trekking with the abiotic component of the landscape as a topic (geotourism). To investigate these areas, rigorous bibliographic research and a geomorphological assessment are the first steps, but non-invasive methods are also increasingly in demand. In this paper, we present a multidisciplinary study on the Etruscan Well (third century B.C.), one of the most important Etruscan artefacts in Perugia (Umbria region, Central Italy). The characteristics of the sedimentary deposits outcropping along the perimeter walls have been collected. Moreover, to show the underground geoheritage, we provide a 3D model of the well and the surrounding area, integrating a georeferenced laser scanner survey with ground-penetrating radar prospecting. We aim to obtain a three-dimensional mapping of the accessible internal rooms to depict the geological characteristics of the Etruscan Well, also revealing a surrounding network of buried galleries. The results are not only a meaningful advancement in the archaeological, geological and historical knowledge of the downtown of Perugia but also a stimulus for geoheritage promotion and dissemination, providing images and 3D reconstructions of underground areas.
GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction
Neural implicit representations have recently demonstrated compelling results
on dense Simultaneous Localization And Mapping (SLAM) but suffer from the
accumulation of errors in camera tracking and distortion in the reconstruction.
To this end, we present GO-SLAM, a deep-learning-based dense visual SLAM
framework globally optimizing poses and 3D reconstruction in real-time. Robust
pose estimation is at its core, supported by efficient loop closing and online
full bundle adjustment, which optimize per frame by utilizing the learned
global geometry of the complete history of input frames. Simultaneously, we
update the implicit and continuous surface representation on-the-fly to ensure
global consistency of 3D reconstruction. Results on various synthetic and
real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art
approaches at tracking robustness and reconstruction accuracy. Furthermore,
GO-SLAM is versatile and can run with monocular, stereo, and RGB-D input.
Comment: ICCV 2023. Code: https://github.com/youmi-zym/GO-SLAM - Project Page:
https://youmi-zym.github.io/projects/GO-SLAM